Fitting a SEIR model to COVID-19 data from Lombardy, Italy

17 April 2020

Robert Perrotta

We use the pymc3 probabilistic programming library to fit a simplified SEIR model to the COVID-19 data recorded for Lombardy, Italy by the Protezione Civile and made available at https://github.com/pcm-dpc/COVID-19. Model assumptions are discussed and the quality of the fitted model is examined.

Analyzing the model fit

The model trace contains samples from the posterior for all our parameters. After discarding the burn-in period and sub-sampling to reduce the correlation between successive samples, we can use these parameter sets to generate plausible model configurations. For each model state, instead of a single best-fit trace, we get a distribution of traces. Because probability density is not very intuitive, we instead map each trace to a probability on the cumulative distribution of our samples, then compute the tail probability, i.e. the probability of the true value lying farther from the model median than that trace does.
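The thinning and tail-probability steps can be sketched with NumPy as follows. This is a minimal illustration rather than the notebook's actual code: the function names `thin_trace` and `tail_probability` are hypothetical, and the two-sided definition of the tail probability, 2 * min(F, 1 - F) with F the empirical cumulative distribution, is an assumption about what "tail probability" means here.

```python
import numpy as np


def thin_trace(samples, burn_in, step):
    """Drop the first `burn_in` draws, then keep every `step`-th draw."""
    return samples[burn_in::step]


def tail_probability(traces, grid):
    """Two-sided tail probability for each grid value at each time point.

    traces: array of shape (n_samples, n_times), one simulated curve per
            posterior sample.
    grid:   1-D array of candidate values for the modeled quantity.
    Returns an array of shape (len(grid), n_times) with values in [0, 1];
    values near 1 are close to the model median, values near 0 are far out
    in either tail.
    """
    traces = np.asarray(traces, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Empirical CDF: fraction of sampled curves at or below each grid value.
    cdf = (traces[None, :, :] <= grid[:, None, None]).mean(axis=1)
    # Assumed two-sided definition: twice the smaller of the two tails.
    return 2.0 * np.minimum(cdf, 1.0 - cdf)
```

Plotting the returned array as a shaded band over time gives the kind of picture described above: dark where the samples agree, fading toward values that few sampled traces reach.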

Modeled total confirmed cases

Our model makes the following distribution of predictions for total confirmed cases, which fit the confirmed cases in the data well. The plots below show the model predictions through the first of June, assuming the current policies remain in effect. The bottom plot is identical to the top except that its y-axis is log-scaled.

In [10]:
plot
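Out[10]: [plot of modeled total confirmed cases; top panel with linear y-axis, bottom panel with log-scaled y-axis]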

Modeled unknown cases

Our model predicts the following distribution of unconfirmed cases.

In [12]:
plot
Out[12]: [plot of modeled unconfirmed cases]

Modeled number of deaths from known cases

In [14]:
plot
Out[14]: [plot of modeled number of deaths from known cases]

Model assumptions and simplifications

  • No resusceptibility
  • No birth and no death except from COVID-19
  • Model parameters are constant over time except the transmission rate between unconfirmed cases, which changes twice -- once on February 22nd, when Lombardy was first put under lockdown, and again on March 8th, when Italy shut down all non-essential businesses nationwide.
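
To make these assumptions concrete, here is a minimal sketch of a generic discrete-time SEIR simulation with a piecewise-constant transmission rate. It is not the simplified model fitted in this notebook, which distinguishes confirmed from unconfirmed cases. The function names, the daily Euler update, and all parameter values in the usage lines are assumptions chosen for illustration, and deaths are lumped into the removed compartment.

```python
import numpy as np


def piecewise_beta(t, change_days, betas):
    """Return the transmission rate in effect on day t.

    change_days: sorted day indices at which the rate switches.
    betas:       len(change_days) + 1 rates, one per interval.
    """
    # side="right" counts how many change points are <= t.
    return np.asarray(betas)[np.searchsorted(change_days, t, side="right")]


def simulate_seir(n_days, population, e0, betas, change_days, sigma, gamma):
    """Simulate S, E, I, R with one Euler step per day.

    No resusceptibility: nothing ever flows out of R.
    No births and no other deaths: S + E + I + R stays equal to `population`.
    Returns an array of shape (n_days, 4) with columns S, E, I, R.
    """
    s, e, i, r = population - e0, e0, 0.0, 0.0
    out = np.empty((n_days, 4))
    for t in range(n_days):
        out[t] = s, e, i, r
        beta = piecewise_beta(t, change_days, betas)
        new_exposed = beta * s * i / population
        new_infectious = sigma * e
        new_removed = gamma * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
    return out


# Hypothetical values: day 10 stands in for February 22nd and day 25 for
# March 8th (15 days later); the rates are illustrative, not fitted.
states = simulate_seir(
    n_days=120, population=1e7, e0=100.0,
    betas=[0.6, 0.3, 0.1], change_days=[10, 25],
    sigma=0.2, gamma=0.1,
)
```

In a fitted version, the three rates would be among the parameters sampled from the posterior, while the change days would be fixed to the policy dates.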

Model limitations

  • No attempt to compensate for reporting lag.

Possible next steps

  • Hold out the latest data to assess the quality of predictions
  • Develop more sophisticated models of reporting error
  • Use model to predict possible outcomes of lifting restrictions
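
The hold-out idea in the first item could be sketched as below: fit on all but the last few days, then check how often the held-out observations fall inside a central predictive interval. The function names and the 90% level are hypothetical choices, and `predicted` is assumed to hold one simulated curve per posterior sample over the held-out days.

```python
import numpy as np


def split_holdout(observed, n_holdout):
    """Split a time series into a fitting part and the last n_holdout points."""
    if n_holdout <= 0:
        raise ValueError("n_holdout must be positive")
    return observed[:-n_holdout], observed[-n_holdout:]


def interval_coverage(predicted, held_out, level=0.9):
    """Fraction of held-out points inside the central predictive interval.

    predicted: array of shape (n_samples, n_holdout).
    held_out:  array of shape (n_holdout,).
    """
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(predicted, [tail, 1.0 - tail], axis=0)
    inside = (held_out >= lower) & (held_out <= upper)
    return float(np.mean(inside))
```

Coverage well below the nominal level would suggest the model is overconfident about the future, while coverage near 1 with very wide intervals would suggest it is uninformative.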